Abstract

Think  Like  a  Commander  -  Excellence  in  Leadership 
(TLAC-XL)  is  an  application  designed  for  learning 
leadership  skills  both  from  the  experiences  of  others  and 
through  a  structured  dialogue  about  issues  raised  in  a 
vignette.  The  participant  watches  a  movie,  interacts  with  a 
synthetic  mentor  and  interviews  characters  in  the  story.  The 
goal  is  to  enable  leaders  to  learn  the  human  dimensions  of 
leadership,  addressing  a  gap  in  the  training  tools  currently 
available  to  the  U.S.  Army.  The  TLAC-XL  application 
employs  a  number  of  Artificial  Intelligence  technologies, 
including  the  use  of  a  coordination  architecture,  a  machine 
learning  approach  to  natural  language  processing,  and  an 
algorithm  for  the  automated  animation  of  rendered  human 
faces. 

Leadership  Development 

Leadership is difficult to teach, even for people. While
there is evidence that some are born with an aptitude for
leadership, the traits and skills needed to be an effective
leader are often learned only by experience. This holds
true across a diverse set of domains, including the
corporate world, sports, firefighting and the military,
which is the focus of the project described in this paper.
Given that the military needs to develop a large number of
leaders, it is imperative to find ways to accelerate the
development process using whatever means possible.

The  U.S.  Army  defines  leadership  this  way: 

Leadership is influencing people - by providing
purpose, direction, and motivation - while operating
to accomplish the mission and improving the
organization. (FM 22-100, 1999, p. 1-4.)

To date, most of the Army’s computer-based training
systems for leaders use constructive simulations, which
create an environment where commanders can practice
mission planning and tactics. While these skills are
necessary, they focus on the tactical and technical aspects
of the job. Learning how to influence people and how to
provide purpose, direction and motivation is simply not
supported by most constructive simulation environments.
While recent research on virtual humans and simulation
attempts to address these issues (e.g., Rickel et al., 2002),
there are very few technical applications that support the
development of a deeper understanding of interpersonal
communication, building a positive command climate,
motivating subordinates, and the many other human
dimension factors that define an effective leader.

Furthermore, while the current generation of simulations
can be used for modeling conventional warfare, today’s
military leaders face some of the most complex and
challenging situations imaginable. To a greater degree than
ever before, leaders at the tactical level - captains,
lieutenants and non-commissioned officers (NCOs) - are
being confronted with situations in the operational
environment where their local decisions and actions can
have strategic consequences, political and otherwise
(McCausland & Martin, 2001). Over the past decade the
military has been assigned a new class of missions
requiring an expanded set of skills. Whereas the skills
needed for war-fighting depend heavily on knowledge of
tactics and battle drills, the new missions often have a
different set of requirements. Peacekeeping, stability and
support operations, humanitarian assistance, and homeland
defense require knowledge of the local culture and
politics, as well as skills for dealing with a variety of
outside organizations such as non-governmental groups,
joint forces (inter-service operations), allied commands,
and host nation armed forces.

The challenge for the U.S. armed forces is to develop
leaders who have not only mastered the tactical and
technical skills necessary to be competent commanders,
but have also developed intellectual flexibility,
self-awareness, adaptability, and the ability to deal
with ambiguity, all under stressful conditions (Klein, 1999;
McCausland & Martin, 2001; Ulmer, 1998; TRADOC,
2003).
Learning with stories

Knowing how to motivate a subordinate, how to
communicate a plan (or intent), and how to create a
cohesive team are examples of skills possessed by
effective leaders. Sternberg characterizes these skills as
tacit knowledge (Sternberg et al., 2000), which is a form of
procedural knowledge; it is practical by nature and not
easily verbalized, and its mastery leads to success in a field
or profession. Sternberg and his colleagues have studied
tacit knowledge in a wide range of professions, including
military leadership. To understand how the members of a
profession become successful, stories are collected about
problems or issues and the solutions that were either
applied or learned by the practitioners. These stories are
then used to identify and categorize the tacit knowledge
that leads to the successful practice of the trade. In the
context of their study of military leaders, Sternberg et al.
developed and validated an inventory of tacit knowledge
for military leaders that differed by echelon. In addition,
they suggested some implications for leader development:
(1) use the tacit knowledge categories identified in their
inventory as sources to guide the experiences of a leader,
and (2) use stories that illustrate a particular point as a
launching point for an interaction with a mentor or coach.
This is the first guiding principle of our application: use
stories that illustrate a situation requiring leadership tacit
knowledge to convey an experience to a learner. In fact we
took this principle a step further by engaging professional
filmmakers to craft the telling of the story.

The choice of Hollywood storytelling as a vehicle for
establishing a tactical situation and for exploring key
leadership issues was informed by both narrative theory
and popular culture. Societal norms have long been
transmitted through narrative, in the form of myths, fables,
and fairy tales. The ability to form narratives is recognized
as one of the important developmental stages in children,
and use of narrative is a property of all cultures, not only
those with “advanced” communication skills.

From childhood we learn that storytelling is the basis for
effective communication. “When I was your age, I had a
little red wagon,” a parent begins a tale to soothe a child
over the loss of a pet goldfish. An alternate approach, a
description of nature's life cycle, though technically more
accurate, is less emotionally digestible. Once the situation
is framed by the narrative, however, factual information
can be introduced, information that can affect the listener's
behavior beyond the world of the story.*

* For a discussion of the effects of narrative on real-world
perceptions, see Gerrig (1993).

That narrative provides a more engaging process of
communication than chronologies (events delivered in
chronological order) or other fact-based formats is a matter
of anecdotal observation: even a mediocre film or novel
lacks the narcotic effects of a textbook or lecture. Narrative
theory offers a deeper explanation. As Lev Manovich
observes (2001), the reader/spectator actively tests a
narrative, making assumptions, accepting or rejecting
them, filling in gaps in the narrative text, and creating
whole characters out of the sketchiest of traits. Far from
passively absorbing a narrative's content, the
reader/spectator enjoys an active relationship with it. In
turn, this relationship exercises the reader/spectator's belief
and knowledge systems:

...fictions often have their effect because they call
forth from memory real world events and causal
possibilities. Even when the import of the original
information is canceled out by virtue of its transparent
fictionality, the rest of the accessed-belief structure
remains intact. (Gerrig, 1993, p. 231)

By leveraging these narrative effects in a learning
environment, we hypothesized that the viewer would be
engaged on the multiple levels that narrative, and
Hollywood, are known for, thereby enhancing the
experience.

Learning  through  discourse 

While a story is a powerful medium for communicating
another’s experience, a mentor can reinforce the salient
points to be learned (Sternberg et al., 2000). It has long
been recognized that students learn much more effectively
with a tutor than they do in the
classroom. Bloom (1984) showed that tutored students
scored on average two standard deviations higher than
students who were taught in a traditional classroom setting.
Chi et al. (2001) studied what makes learning with human
tutoring effective and found that, among other things,
tutoring is interactive by nature. Interactivity motivates the
student more than passive listening, and it can result in
deeper learning by promoting student explanation and
reflection. Effective tutors have a knack for scaffolding in
a dialogue, which leads to the construction of new
knowledge. Graesser et al. (2002) also suggest that getting
the student to ask deep questions and make explanations
helps them to construct deep knowledge.

The  TLAC-XL  System 

To capitalize on the effectiveness of both storytelling and
discourse to achieve leadership development objectives,
we developed a software system entitled Think Like a
Commander - Excellence in Leadership (TLAC-XL). The
target population, captains in the U.S. Army, interact with
the system in a straightforward manner. First, they are
presented with a short movie that depicts a situation where
the leadership qualities embodied in the characters
influence how the situation unfolds. Second, the users
engage in a human-computer dialogue with the system
about the leadership issues that are raised.


The dialogue in our system is held between the student and
a synthetic mentor, as well as with some of the characters
in the story. After viewing the vignette, the student is
asked a series of questions by a synthetic mentor, which is
embodied as a photo-real animated character. The format
of this line of questioning is based on a classroom teaching
methodology developed by the Army Research Institute
(ARI) at Ft. Leavenworth, Kansas, known as Think Like a
Commander, or TLAC for short. The purpose of the
original TLAC format was to habituate commanders to ask
eight critical questions when facing any operational
scenario. These questions concerned the mission, the
enemy, the terrain, the available assets, timing, the bigger
picture, the visualization of the battlefield, and possible
contingencies.

The original TLAC discussion format has been used
extensively in classroom settings by ARI and the Army to
teach commanders critical thinking skills about tactical
situations. Our project adapted the original TLAC
approach by first engaging the student with a question
about the tactical scenario portrayed in the movie, and then
raising a leadership issue related to the topic. For instance,
the mentor initially asks questions about the mission,
beginning with the student’s interpretation of the mission,
and then goes on to ask about how the character in the
story appeared to interpret the mission. The mentor then
raises a leadership issue related to the current TLAC point,
where the issue is associated with a character in the
vignette. This leads to a dialogue between the student and
the vignette character. Here the student can ask the
character questions related to the leadership issue, and the
character responds in the form of a video clip that is most
appropriate for the question.

Figure 1 presents a screenshot of the TLAC-XL user
interface. The synthetic mentor appears in the lower right
of the screen. A character from the vignette appears in the
main upper left window, and responds to questions posed
to him by the user.

While we call the interaction between the student and the
mentor and the student and the characters a
“conversation,” it is really a scripted interaction that
follows the TLAC discourse, while allowing a great deal
of flexibility with respect to providing responses to the
student based on what the student said. The student
interacts by typing questions and responses, but the mentor
and the characters all give spoken responses. Thus, the
total experience of the student consists of watching a
movie, interacting with a mentor, and interviewing
characters.

Figure 1. A screenshot of Think Like a Commander - Excellence in Leadership (TLAC-XL)

Architecture 

The TLAC-XL system presents the user with a text-input
console,  a  global  navigation  menu,  a  character  window  and 
a  mentor  window.  Users  can  interact  with  a  synthetic 
mentor  and  characters  from  the  movie  by  typing  questions 
into  the  console.  In  the  TLAC-XL  system  a  number  of 
research  efforts  needed  to  come  together  in  one  single 
application.  Due  to  the  heterogeneous  nature  of  all  the 
components  involved  in  the  resulting  application,  an 
architecture  was  needed  that  created  strong  interactive 
bonds  using  open-ended  software  links.  Various  control 
and  coordination  techniques  are  available  to  coordinate  the 
input  and  output  of  software  components  within  a  single  or 
a  distributed  system,  while  still  allowing  them  to  operate 
independently of each other. We chose a TSpaces-based
event heap coordination architecture (Johanson & Fox,
2002) for our system for a number of reasons:

•  Both  synchronous  and  asynchronous  events  can  be 
managed  within  the  same  control  structure; 

•  Components  attached  to  the  event  heap  do  not  need 
to  know  about  each  other.  This  makes  it  possible  to 
add  a  new  component  without  disturbing  any 
existing  knowledge  sources; 

•  The event heap facilitates a global interaction
standard, instead of custom-tailoring each
component to every other.

Under most circumstances the system is in control of the
navigation between mentor and characters. However, a
method was needed whereby the underlying software
fabric could re-route input and output between components
in a natural way. In an event-heap based architecture a
number of knowledge sources interact with each other by
adding events to and reading events from the shared data
space. This event heap is managed by a control structure
that has control over the distribution of events among all the
knowledge sources that are subscribed to the event heap.
Our control structure is able to seamlessly merge
synchronous and asynchronous events, thus allowing
partial scripts to be interleaved with spontaneous events.
Behind the scenes a number of conversation graphs
coordinate the answers of our virtual actors and provide
the continuity of the overall dialog. The conversation
graphs were originally written in the Java programming
language while the main TLAC-XL application was
written in C++. Our event heap architecture was designed
to include a message based language bridge that can
communicate with the event heap directly as a knowledge
source.
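The coordination pattern described above can be illustrated with a minimal sketch. This is not the TSpaces or Event Heap API; the class `EventHeap`, its methods, and the event types `user_text` and `play_media` are hypothetical names chosen for illustration. The sketch assumes events are plain dictionaries with a `type` field, and shows how components that know nothing about each other can exchange queued (asynchronous) and immediate (synchronous) events through one shared structure.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]


class EventHeap:
    """Toy shared event space: components post events and subscribe by type."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)
        self._pending: List[Event] = []

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._subscribers[event_type].append(handler)

    def post(self, event: Event) -> None:
        # Asynchronous path: queue the event until the next dispatch().
        self._pending.append(event)

    def post_sync(self, event: Event) -> None:
        # Synchronous path: deliver before returning to the caller.
        self._deliver(event)

    def dispatch(self) -> int:
        """Deliver queued events in arrival order; return how many were delivered."""
        delivered = 0
        while self._pending:
            self._deliver(self._pending.pop(0))
            delivered += 1
        return delivered

    def _deliver(self, event: Event) -> None:
        for handler in list(self._subscribers[event["type"]]):
            handler(event)


heap = EventHeap()
played: List[str] = []


def classifier_component(event: Event) -> None:
    # Hypothetical stand-in for the text classifier: it only posts a new event.
    heap.post({"type": "play_media", "clip": "clip_for:" + event["text"]})


heap.subscribe("user_text", classifier_component)
heap.subscribe("play_media", lambda e: played.append(e["clip"]))
heap.post({"type": "user_text", "text": "What was your mission?"})
heap.dispatch()
```

In this sketch the classifier and the media player never reference each other, so either could be replaced (for example by a bridge to a component written in another language) without changing the other.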

Leadership  Scenario 

Students  begin  their  interaction  with  the  TLAC-XL  system 
by  watching  a  video  of  a  fictional  military  operation  where 
leadership  issues  arise.  In  our  first  TLAC-XL  system,  we 
authored  a  vignette  that  was  based  on  the  real  world 
experiences  of  U.S.  Army  captains.  We  began  by 
interviewing  a  group  of  ten  captains  stationed  at  the  United 
States  Military  Academy  at  West  Point.  All  of  these 
captains  had  recently  completed  a  tour  as  company 
commanders, so they had fresh experiences in that position,
which were conveyed to us in the form of stories that we
solicited to illustrate their points. All ten captains told us
about  some  of  their  most  salient  memories  as  commanders 
and  the  leadership  issues  they  faced.  Following  the 
interviews  we  brainstormed  ideas  for  a  current  operational 
scenario  that  could  be  used  as  the  basis  for  a  vignette. 
Based  on  this  input  we  developed  a  humanitarian 
assistance  vignette  that  takes  place  in  Afghanistan,  entitled 
Power  Hungry.  Working  with  subject  matter  experts  from 
the  Center  for  Army  Leadership  and  the  Army  Research 
Institute,  we  went  from  a  script  by  a  Hollywood  writer  to  a 
film  shot  in  a  mountainous,  desert-like  area  in  Southern 
California. 

In the scenario, a company commander, Captain Young,
has  been  given  the  mission  to  run  a  food  distribution 
operation  in  an  area  where  food  is  in  short  supply.  The 
company  quickly  runs  into  a  number  of  obstacles, 
beginning  with  how  to  secure  the  site  given  the  nature  of 
the  terrain  -  soft  soil,  located  in  a  bowl  surrounded  by  hills 
and  two  possible  entry  points.  It  is  necessary  to  create 
lanes  with  wire  to  keep  control  of  the  crowds  that  are 
expected  to  arrive  soon.  The  company’s  lieutenants  begin 
rigging  the  site,  but  their  plan  does  not  satisfy  the 
commander,  who  directs  the  executive  officer  to  start  over, 
giving  very  little  guidance  other  than  to  stall  the  food 
trucks in order to allow time to prepare the site. In the
meantime, first one then another local warlord appears,
offering to “help” with security. Turning away the
warlords proves difficult, particularly due to conflicting
advice from a brigade command sergeant major (CSM),
who happens to be at the company’s site escorting a
media crew. The brigade CSM plays a significant but
ambiguous  role  in  the  vignette.  He  offers  advice  that  seems 
to  suggest  that  he  has  some  inside  knowledge  about  the 
brigade  commander’s  intent.  His  advice  runs  counter  to  the 
commander’s  instincts  in  several  instances,  and  the  captain 
listens.  At  his  suggestion  the  commander  meets  with  one 
of  the  warlords  to  discuss  the  situation.  Meanwhile  the 
situation  worsens  as  the  executive  officer  is  unable  to 
delay  the  trucks,  and  after  some  twists  and  turns  in  the 
story,  the  warlords  hatch  their  plot  to  take  control  of  the 
food.  The  full  duration  of  the  Power  Hungry  vignette  is 
slightly  more  than  thirteen  minutes. 


This  vignette  was  authored  so  as  to  incorporate  six  specific 
leadership  issues  that  were  raised  by  the  U.S.  Army 
captains  that  we  interviewed.  While  each  of  these  issues 
involves  the  behavior  of  the  fictional  captain  in  our 
vignette,  the  vignette  was  authored  in  such  a  way  as  to 
associate  each  issue  with  a  different  character.  For 
example,  the  unexpected  presence  of  a  brigade  command 
sergeant  major  causes  some  problems  for  the  captain  in  the 
vignette  related  to  the  influence  that  is  brought  to  his 
command decisions. Here the leadership issue is one that
concerns  the  captain,  but  the  issue  is  associated  with  the 
character  of  the  command  sergeant  major  in  this  vignette. 
During  the  interactive  portion  of  the  TLAC-XL  system, 
students  are  given  the  opportunity  to  question  each  of  the 
characters  directly  about  the  leadership  issue  that  they  are 
associated  with.  The  six  leadership  issues  in  the  Power 
Hungry  vignette  are  as  follows: 

1.  Shared  vision  of  intent  (LT  Perez)

2.  Command  influence  (CSM  Pullman) 

3.  Setting  a  model  of  command  (LT  Wychowski) 

4.  Clarity  of  mission  (CPT  Young) 

5.  Cultural  awareness  (Omar  the  warlord) 

6.  Respect  for  experience  (SGT  Jones) 

Classification-based  conversations 

After  watching  the  video  of  the  vignette,  the  trainee  begins 
a  question-answer  dialogue  with  a  virtual  mentor.  The 
virtual  mentor,  visualized  as  a  photo-real  animated 
character,  poses  questions  to  the  student,  who  responds  by 
entering  natural  language  text  using  the  keyboard.  Within 
the  course  of  this  interaction,  the  virtual  mentor  introduces 
characters  from  the  vignette,  and  allows  the  student  to 
compose  questions  to  them  directly.  Responses  from  the 
vignette  characters  are  presented  as  video  recordings. 

In each dialogue mode, either answering questions from
the mentor or asking questions of vignette characters,
appropriate responses must be presented to the trainee to
achieve a sense of coherence in the dialogue as well as the
pedagogical goals. To accomplish this, we follow a
statistical, machine learning approach for processing the
natural language input of the user. At any point in the
interaction  in  either  dialogue  mode,  there  are  a  fixed 
number  of  pre-authored  media  items  that  are  possible  to 
present  to  the  trainee,  each  of  which  would  move  the 
conversation  forward  one  turn.  The  task,  therefore,  is  to 
select  the  most  appropriate  member  of  the  set  of 
possibilities  given  the  trainee’s  textual  input.  By  using  a 
statistical,  machine  learning  approach,  where  the  trainee’s 
input  is  classified  based  on  the  available  supervised 
training  data  from  previous  users,  acceptable  levels  of 
performance  can  be  obtained  in  a  manner  that  is  robust  to 
slight  variations  in  language  use. 


Classification  algorithm 

To classify the textual input of a trainee using a machine
learning approach, we employ a
Naive Bayesian classification algorithm (George &
Langley, 1995) implemented in the WEKA open source
toolkit (Witten & Eibe, 1999). To construct feature vector
instances for training and test data, we treat user text inputs
as a set of features consisting of individual words
(unigrams) and adjacent pairs of words (bigrams). Feature
vectors are constructed without stop-list filters and
without truncating the feature space. Punctuation and
variation in case are ignored, and feature counts are used
as feature values, although feature counts
are very rarely greater than one for a given instance.
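As a rough illustration of the feature construction just described, the following sketch builds unigram and bigram counts from a line of trainee text. The function name `extract_features`, the tokenization rule (runs of letters and digits, with internal apostrophes kept), and the `a_b` naming of bigram features are assumptions made for this example, not details given in the paper.

```python
import re
from collections import Counter


def extract_features(text: str) -> Counter:
    """Count unigrams and adjacent-word bigrams, ignoring case and punctuation."""
    # Lowercase first so that "Mission" and "mission" map to the same feature.
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())
    features = Counter(tokens)
    # Add one feature per pair of adjacent words; no stop-list is applied.
    features.update(f"{a}_{b}" for a, b in zip(tokens, tokens[1:]))
    return features


example = extract_features("What was your mission?")
```

For the four-word question above, `example` holds four unigram features and three bigram features, each with a count of one, which matches the observation that counts above one are rare for short inputs.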

In  order  to  aid  in  the  development  of  an  operational 
prototype,  the  training  data  used  for  classification  of 
trainee  textual  input  was  seeded  with  training  examples 
fabricated  by  our  development  team  to  serve  as  a 
placeholder  in  the  absence  of  real  data  from  our  user 
population.  As  more  legitimate  data  was  being  collected,  it 
became  evident  that  the  seed  examples  were 
indistinguishable  from  the  real  data  in  form  and  content, 
and  were  retained  in  the  complete  training  data  set. 
Examples  of  the  seed  data  for  a  single  class  are  as  follows: 

Class:  Mission-intent 

What  was  your  understanding  of  the  mission? 

What  was  your  mission? 

What  do  you  think  the  purpose  of  this  operation  was? 

What  were  you  trying  to  accomplish  here  today? 

What  is  the  goal  of  this  food  distribution  operation? 

Did  you  understand  the  purpose  of  this  mission? 
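To show how seed examples of this kind could drive response selection, here is a small self-contained sketch of a naive Bayes text classifier. It is an assumption-laden stand-in rather than the WEKA implementation used in the system: it uses a multinomial model with add-one smoothing, the class `NaiveBayes` and function `features` are hypothetical names, and the `Site-layout` class with its three questions is invented purely so that there is a second class to discriminate against. Only the three `Mission-intent` questions come from the seed data above.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Unigram and bigram counts, case-folded, punctuation dropped."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens) + Counter(f"{a}_{b}" for a, b in zip(tokens, tokens[1:]))


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, examples):
        self.class_counts = Counter(label for _, label in examples)
        self.feature_counts = defaultdict(Counter)
        for text, label in examples:
            self.feature_counts[label].update(features(text))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for feat, k in features(text).items():
                if feat in self.vocab:  # features unseen in training are skipped
                    score += k * math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train = [
    ("What was your understanding of the mission?", "Mission-intent"),
    ("What was your mission?", "Mission-intent"),
    ("What were you trying to accomplish here today?", "Mission-intent"),
    # Invented class and examples, for illustration only.
    ("Why did you change the wire lanes?", "Site-layout"),
    ("How did the soft soil affect the site?", "Site-layout"),
    ("Who planned the layout of the site?", "Site-layout"),
]
model = NaiveBayes().fit(train)
```

Each predicted label would then be mapped to one pre-authored media item, so a paraphrase such as "What was the mission?" can select the same clip as the seed questions it resembles.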

Classification  performance 

To  evaluate  the  performance  of  this  approach  to  trainee 
input  classification,  a  cross-validation  analysis  (10-fold) 
was  performed  using  6  sets  of  supervised  training  data,  one 
for  each  of  the  classifiers  that  is  used  to  select  the  most 
appropriate  response  to  a  trainee’s  question  during 
character  interviews.  Although  both  the  mentor  interaction 
and  the  character  interviews  employ  the  same  classification 
approach,  the  mentor  interaction  was  structured  in  a  way 
where  there  were  at  most  two  possible  mentor  responses 
for  an  answer  typed  in  by  a  trainee  (corresponding  to 
agreement  or  disagreement).  In  contrast,  the  character 
interviews  are  much  more  demanding  on  the  classification 
algorithm,  where  there  are  an  average  of  13  possible 
character  responses  available. 
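The evaluation procedure can be sketched generically. The function below is a minimal k-fold cross-validation loop, assuming a caller-supplied `train` function that returns a predictor; the names `cross_validate` and `train_majority` and the fold-assignment scheme are choices made for this example, and the majority-class baseline is included only so the sketch runs on its own.

```python
import random
from collections import Counter


def cross_validate(examples, train, k=10, seed=0):
    """Return the fraction of held-out instances classified correctly over k folds.

    `train(training_examples)` must return a function mapping an input to a label.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Deal instances into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    correct = 0
    for i, held_out in enumerate(folds):
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        predict = train(training)
        correct += sum(predict(x) == y for x, y in held_out)
    return correct / len(shuffled)


def train_majority(training):
    """Baseline that always predicts the most common training label."""
    label = Counter(y for _, y in training).most_common(1)[0][0]
    return lambda x: label


# Toy data: 15 instances of class "a" and 5 of class "b".
data = [(i, "a") for i in range(15)] + [(i, "b") for i in range(5)]
baseline_accuracy = cross_validate(data, train_majority)
```

A baseline of this kind gives a reference point for reading the accuracies reported for the character classifiers, where each input must be assigned to one of many classes.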

Figure 2 presents the results of the cross-validation
analysis for each of the six character interview classifiers
used in our system. Accuracy is presented as the likelihood
that a novel input will be correctly classified, and
performance levels for the initial seeded training data are
presented along with those obtained through the addition of
legitimately collected instances. Interestingly, the
admittedly modest amount of legitimate training data that
we have been able to collect thus far has not significantly
improved the level of performance beyond what was
obtained using the initial seed data. The Naive Bayesian
learning algorithm outperformed several other approaches
that we evaluated for this classification task, with C4.5 rule
induction performing almost as well. However,
contemporary kernel methods and support vector machines
were not evaluated, and we expect that greater
performance could be obtained by capitalizing on recent
advances in these methods.

Character    Classes  Seed       Seed      Total      Total
classifier            instances  accuracy  instances  accuracy
Jones              8         48    58.3%        128     62.5%
Omar              11         66    72.7%        187     68.4%
Perez             15         90    72.6%        175     73.1%
Pullman           13         78    62.3%        221     65.2%
Wychowski         10         60    58.3%        142     61.3%
Young             19        114    66.7%        309     63.8%
Average        12.67         76   65.15%     193.67    65.72%

Figure 2. Character Interview Classifier Performance (10-fold cross validation)

Conversation  graphs 

In  order  to  design  effective  interactions  between  trainees 
and  the  system,  we  encoded  the  set  of  possible 
trainee/system  dialogues  as  a  directed  finite-state  graph. 
Each  node  in  the  graph  represented  a  dialogue  turn  where 
the  system  said  something  (using  media),  and  each  arc  in 
the  graph  represented  a  classification  of  the  trainee’s  typed 
input.  Every  node  in  this  graph  that  has  more  than  one  arc 
transitioning  away  from  the  node  requires  a  separate 
classification of the trainee input. The section of this graph
representing the mentor interactions includes 12 separate
classifiers for this purpose, mainly to determine whether
the mentor should agree or disagree with a trainee’s
response to a mentor’s preceding question. However, each
of the six character conversations is driven by a single
classifier, which selects the most appropriate answer from
the character. Graphical representations of the mentor
graph and a character interview graph are presented in
Figures 3 and 4.
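The graph encoding described above can be sketched as a small data structure. The names `Node` and `step`, the node identifiers, the media file names, and the keyword-based classifier are all hypothetical and stand in for the trained classifiers and authored media of the real system. The sketch assumes that a node with a single outgoing arc is followed without classification and that a node with several arcs carries its own classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Node:
    """One system turn: the media item to play and the arcs leaving the node."""
    media: str
    arcs: Dict[str, str] = field(default_factory=dict)  # class label -> next node id
    classify: Optional[Callable[[str], str]] = None  # needed only with several arcs


def step(graph: Dict[str, Node], node_id: str, trainee_text: str) -> Optional[str]:
    """Follow the arc selected by the trainee's input; None means the dialogue ended."""
    node = graph[node_id]
    if not node.arcs:
        return None
    if len(node.arcs) == 1:
        return next(iter(node.arcs.values()))
    return node.arcs[node.classify(trainee_text)]


def agree_or_disagree(text: str) -> str:
    # Hypothetical stand-in for a trained two-class mentor classifier.
    return "agree" if "food" in text.lower() else "disagree"


graph = {
    "ask_mission": Node("mentor_q_mission.mov",
                        {"agree": "mentor_agree", "disagree": "mentor_disagree"},
                        agree_or_disagree),
    "mentor_agree": Node("mentor_agree.mov", {"next": "ask_enemy"}),
    "mentor_disagree": Node("mentor_disagree.mov", {"next": "ask_enemy"}),
    "ask_enemy": Node("mentor_q_enemy.mov"),
}

next_id = step(graph, "ask_mission", "To distribute food safely")
```

Only `ask_mission` needs a classifier here, which mirrors the rule that a separate classification is required at every node with more than one outgoing arc.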

As  seen  in  Figure  3,  the  mentor  interaction  can  be  viewed 
as  an  eight-tiered  interaction,  where  each  tier  corresponds 
to  a  line  of  questioning  that  concerns  one  of  the  eight 
Think  Like  a  Commander  (TLAC)  points  used  in  the 
previous  work  of  the  Army  Research  Institute.  Within  each 
tier,  the  mentor  begins  by  asking  a  few  preliminary 
questions  about  the  topic  (e.g.  “What  was  your 
understanding  of  the  mission?”)  that  lead  to  one  of  the  six 
critical  leadership  issues  that  were  brought  up  in  the 
vignette. To explore these leadership issues (if necessary,
based on the user’s response to a pointed question), the
mentor will allow the trainee to conduct an interview
with a relevant character from the vignette. Each node
labeled  with  a  letter  in  the  mentor  graph  indicates  a  point 


where  the  mentor  introduces  a  character,  invoking  an 
embedded  subgraph  corresponding  to  a  character 
interview.  At  the  end  of  an  embedded  character  interview, 
the  mentor  asks  a  follow-up  question  aimed  at  determining 
the trainee’s understanding of how the leadership issue
relates  to  the  given  Think  Like  a  Commander  point,  then 
moves  on  to  the  next  point. 

Figure 4 illustrates the general shape of an embedded
subgraph for supporting a trainee-led character interview.
A single classifier is used to route a trainee’s question to
one of a set of possible character responses. The embedded
subgraph is used repetitively to allow the trainee to ask
multiple questions, until they indicate to the system that
the interview is over by means of a user interface button.
When a trainee’s textual input is classified to the same
category over multiple repetitions (the system believes
they are asking the same question twice or more), a
secondary media item is presented to the user, typically
one in which the character states that they’ve already
answered that question and have nothing more to say on
the matter.

Figure 4. A character interview graph
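The repeat-handling behaviour of a character interview can be sketched as follows. The class `CharacterInterview`, the clip file names, and the keyword classifier are hypothetical. The sketch also assumes a single generic secondary clip per character, whereas the system may well have had a distinct secondary item for each category.

```python
from typing import Callable, Dict, Set


class CharacterInterview:
    """Route each question to a clip; a repeated category gets a secondary clip."""

    def __init__(self, classify: Callable[[str], str],
                 clips: Dict[str, str], repeat_clip: str) -> None:
        self.classify = classify
        self.clips = clips
        self.repeat_clip = repeat_clip
        self.seen: Set[str] = set()  # categories already answered in this interview

    def ask(self, question: str) -> str:
        label = self.classify(question)
        if label in self.seen:
            return self.repeat_clip
        self.seen.add(label)
        return self.clips[label]


def keyword_classifier(question: str) -> str:
    # Hypothetical stand-in for the single trained classifier of one character.
    return "mission-intent" if "mission" in question.lower() else "other"


interview = CharacterInterview(
    keyword_classifier,
    clips={"mission-intent": "pullman_mission.mov", "other": "pullman_other.mov"},
    repeat_clip="pullman_already_answered.mov",
)
first = interview.ask("What was your mission?")
second = interview.ask("Tell me about the mission again.")
```

Because the loop keeps no state other than the set of answered categories, the trainee can ask questions in any order until they end the interview.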

Animated  Mentor 

To support the conversational interactions with the mentor,
we developed an animated character (Figure 5). One of the
requirements for our character is that he should look
lifelike and engaging to the trainees. We leveraged
computer graphics technology to bring this character to life
and build a digital talking head that can be animated for an
arbitrary input sentence. Our approach falls within the
realm of visual speech synthesis: the facial animation
system takes as input a speech signal and outputs the
corresponding animation.

Realistic animation of a synthetic human is a difficult task
due to the complexity of the human body, one that
traditionally involves many digital artists in the special
effects industry. We took advantage of motion capture
technology to bring realism into the synthetic mentor at an
affordable cost. Motion capture allows the accurate
recording of live actors' motions. We used this technology
to record a large database of speech-related motions from a
live actor. We then analyzed this data to build a generative
statistical model of this actor's facial motions. This model
used the database of motions indexed with speech. We
organized this database according to the phonemes of the
recorded speech: each phoneme is associated with a large
number of motion fragments.


Figure 5. Synthetic Mentor


To generate animations from our model, given an input
speech signal, we first segment it into phonemes. This string
of phonemes is then used as a guideline to extract from the
motion database a corresponding sequence of motion
fragments. The motion fragments are optimally chosen to
maximize the fidelity of the synthesized motion. We stitch
the sequence together to produce a facial motion that both
matches the input speech and is visually realistic.
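One way to picture the selection-and-stitching step is as a shortest-path search over candidate fragments. The Python sketch below is an assumption made for illustration and is not the paper's statistical model: it reduces each motion fragment to a hypothetical one-dimensional trajectory (for example, jaw opening per frame), uses the discontinuity between adjacent fragments as a stand-in for "fidelity", and picks the sequence with the smallest total discontinuity by dynamic programming. The database contents and the names select_fragments and stitch are hypothetical.

```python
from typing import Dict, List, Sequence

Fragment = List[float]  # hypothetical 1-D trajectory, one value per frame

# Hypothetical motion database: each phoneme maps to candidate fragments.
DB: Dict[str, List[Fragment]] = {
    "h": [[0.0, 0.2], [0.5, 0.6]],
    "ay": [[0.6, 0.9, 0.4], [0.25, 0.8, 0.3]],
    "m": [[0.4, 0.0], [0.9, 0.1]],
}


def select_fragments(
    phonemes: Sequence[str], database: Dict[str, List[Fragment]]
) -> List[Fragment]:
    """Pick one fragment per phoneme, minimizing total join discontinuity."""
    if not phonemes:
        return []
    candidates = [database[p] for p in phonemes]
    # best[i][k]: minimal cumulative cost of a path ending in candidate k.
    best = [[0.0] * len(candidates[0])]
    back: List[List[int]] = []
    for i in range(1, len(candidates)):
        row, ptr = [], []
        for frag in candidates[i]:
            # Join cost: jump between the previous fragment's last frame
            # and this fragment's first frame.
            costs = [
                best[i - 1][j] + abs(prev[-1] - frag[0])
                for j, prev in enumerate(candidates[i - 1])
            ]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[j_min])
            ptr.append(j_min)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest final candidate.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return [candidates[i][k] for i, k in enumerate(path)]


def stitch(fragments: Sequence[Fragment]) -> List[float]:
    """Concatenate the chosen fragments into one motion trajectory."""
    motion: List[float] = []
    for frag in fragments:
        motion.extend(frag)
    return motion


chosen = select_fragments(["h", "ay", "m"], DB)
motion = stitch(chosen)
print(motion)
```

A real system would work with multi-dimensional facial marker data, would also score how well each fragment matches the input speech, and would blend across fragment boundaries rather than simply concatenating.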

Results 

At the time this paper was written, two sets of evaluations
had been conducted by the Army Research Institute to
study the effectiveness of the TLAC-XL system. The first
consisted of an initial series of formative evaluations at Ft.
Lewis, WA, aimed at developing the evaluation method
itself. As TLAC-XL involved a non-traditional interaction
with students and subtle training objectives, it was
necessary to investigate appropriate techniques for
obtaining pre-test and post-test data from subjects. This
first evaluation provided us with one specific and
unexpected result. In most military training scenarios the
final outcome of the operation is overwhelmingly positive.
However, our story ends in a failure of the mission. As a
consequence, our test subjects were, in most cases, highly
disgruntled by what they saw. At first, the evaluation team
viewed this negative response as an apparent failing of the
system. However, the agitation expressed by our subjects
appeared to support the interaction that occurred after
watching the story. Most test subjects used the interactive
portion of the session to vent their frustrations concerning
the mission to the virtual mentor and virtual characters.

A second set of evaluations was performed at Ft. Drum,
NY. Here, more evidence was gathered to suggest that the
frustration evoked by watching the vignette can provide a
strong force for learning, leading our subjects (U.S. Army
captains) into heated discussions. In this set of evaluations,
subjects would spend 1 1/2 to 2 hours with the
system on average, and engage in additional discussions
with evaluators concerning various possible outcomes and
solutions. To evaluate the relative value of guided
conversations with interactive characters versus traditional
classroom methods, a comparison was conducted between
TLAC-XL and a slideshow version of the scenario. Early
results of this comparison suggest that the slideshow
variation was effective at presenting the scenario in a way
that enabled students to remember facts about the mission.
However, subjects using the TLAC-XL application had an
additional understanding of the interpersonal dynamics that
contributed to the failure of the mission, one that went
beyond the factual details of the scenario.

Through these and other evaluations, we have learned a
number of lessons about the guided conversations. When
students ask questions within the scope supported by the
conversation graph, the answers can appear to be highly
realistic and engaging. When students ask questions of
virtual characters that are outside the expected scope, the
irrelevant answers that are given in response can be
frustrating to the student, but can also give the appearance
that the character is simply avoiding the question. Also, it
appears that failures in classifying students’ questions can
be mitigated somewhat by responding with engaging
content. That is, the students may be less frustrated with a
character response that is not relevant to their question as
long as it is interesting in its own right and relevant to the
larger topic of conversation.

The TLAC-XL system has been demonstrated to a broad
range of U.S. Army officers ranging in rank from
lieutenant to general. The universal reaction to the vignette
has been that it is very engaging and stirring. Besides good
storytelling, one of the reasons we believe that the vignette
has been so well received is that it hits several areas that
the Army currently needs to cover in leader development,
but for which it does not have any technological support.
The scenario encompasses a contemporary operational
environment, a food distribution operation in Afghanistan,
which is in the Army’s new spectrum of operations.
Furthermore, it raises cross-cultural issues, interpersonal
communication, command climate, and a number of the
other human dimensions of leadership.

Future  Work 

There  is  a  lot  of  work  we  would  still  like  to  do  on  this 
project.  To  more  fully  support  deep  learning  we  plan  to 
take  seriously  the  need  for  student  modeling,  analysis  of 
the  input,  and  providing  customized  feedback.  In  addition 
we  plan  to  incorporate  tutoring  strategies  based  on  the 
kinds  of  questions  asked  by  participants.  It  has  been 
observed  by  our  ARI  colleagues  that  less  experienced 
leaders  may  not  have  the  ability  to  ask  the  right  questions. 
A  skilled  tutor  knows  how  to  ask  telling  questions  in  these 
instances,  to  prompt  the  generation  of  a  more  focused 
question  that  may  not  have  been  considered  otherwise.  In 
addition,  we  plan  to  expand  the  capabilities  of  the 
animated  tutor  to  incorporate  text-to-speech  technology, 
enabling  an  even  greater  degree  of  customization.  At  the 
prompting  of  our  colleagues  in  the  Army,  we  plan  to 
provide  multiple  identities  for  the  mentor  to  represent  other 
races  and  genders. 